Role of Emoticons in Sentence-Level Sentiment Classification

Authors

  • Martin Min
  • Tanya Lee
  • Ray Hsu
Abstract

Automated sentiment extraction from social media is an enabling technology for gathering online customer insights. Basic sentiment extraction is the semantic classification of a text unit as positive or negative, using lexical and/or contextual clues in a natural language system. On the input side, social media as a sub-language often mixes emoticons with text to show emotions. Most emoticons, e.g. :=), are not natural language words but textual symbols that use characters to present a smiley face. Intuitively, such symbols are innately associated with emotions, whether happy, annoyed or indifferent, and are hence important clues for sentiment classification. Previous research has made limited use of emoticons as noisy labels in sentiment learning, but no detailed study of how noisy or useful they are has been done. This paper presents a comprehensive data analysis of the role of emoticons in sentence-level sentiment classification. Various investigations are conducted on a fairly large annotated social media corpus selected by our consumer insight analytics system. This corpus consists of 40,548 sentiment-rich sentences that business users are truly interested in mining. The study shows that the consistency between positive/negative emoticons and human judgment in this corpus is as high as 75.2%. Another, larger, randomly selected corpus of 300,000 social media sentences shows a consistency with human judgment of 40.1%. A further study finds that emoticons’ recall contribution to sentiment classification is moderate; nevertheless, data containing both emoticons and brands are guaranteed to be quality social media representing customers’ voice rather than businesses’ voice such as press news. In addition, emoticons are an additional factor that helps extract sentiment where other linguistic clues are insufficient.
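
The abstract reports "consistency" between emoticon polarity and human judgment without defining how it is computed. One plausible reading, sketched below in Python, is the share of emoticon-bearing sentences whose emoticon polarity matches the human label. The emoticon sets, the function names and the sample sentences are all hypothetical illustrations, not the authors' lexicon, method or data.

```python
# Hypothetical emoticon lexicon; the paper's actual list is not given in the abstract.
POSITIVE = {":)", ":-)", ":D", ";)", "=)"}
NEGATIVE = {":(", ":-(", ":'(", "=("}


def emoticon_polarity(sentence):
    """Return 'positive' or 'negative', or None if emoticons are absent or mixed."""
    tokens = sentence.split()  # assumes emoticons are whitespace-separated
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:  # neither kind present, or both present
        return None
    return "positive" if has_pos else "negative"


def consistency(labeled):
    """Fraction of emoticon-bearing sentences whose emoticon agrees with the human label."""
    covered = []
    for sentence, human_label in labeled:
        polarity = emoticon_polarity(sentence)
        if polarity is not None:
            covered.append(polarity == human_label)
    if not covered:
        return 0.0
    return sum(covered) / len(covered)


# Invented toy data: (sentence, human-annotated polarity).
sample = [
    ("Love my new phone :)", "positive"),
    ("Battery died again :(", "negative"),
    ("Great, another update :)", "negative"),  # sarcastic use of a smiley
    ("Shipping was fast", "positive"),         # no emoticon, so not counted
]

print(f"consistency = {consistency(sample):.3f}")
```

Sentences without an emoticon are excluded from the denominator, so this measures how reliable emoticons are when present, not how often they appear. That second question corresponds to the recall contribution the abstract mentions.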


Similar resources

Exploiting Emoticons in Polarity Classification of Text

With people increasingly using emoticons in written text on the Web in order to express, stress, or disambiguate their sentiment, it is crucial for automated sentiment analysis tools to correctly account for such graphical cues for sentiment. We analyze how emoticons typically convey sentiment and we subsequently propose and evaluate a novel method for exploiting this with a manually created em...


Linguistically Regularized LSTM for Sentiment Classification

This paper deals with sentence-level sentiment classification. Though a variety of neural network models have been proposed recently, however, previous models either depend on expensive phrase-level annotation, most of which has remarkably degraded performance when trained with only sentence-level annotation; or do not fully employ linguistic resources (e.g., sentiment lexicons, negation words,...


Necessity of Feature Selection when Augmenting Tweet Sentiment Feature Spaces with Emoticons

Tweet sentiment classification seeks to identify the emotional polarity of a tweet. One potential way to enhance classification performance is to include emoticons as features. Emoticons are representations of faces expressing various emotions in text. They are created through combinations of letters, punctuation marks and symbols, and are frequently found within tweets. While emoticons have be...


Lexicon-enhanced sentiment analysis framework using rule-based classification scheme

With the rapid increase in social networks and blogs, the social media services are increasingly being used by online communities to share their views and experiences about a particular product, policy and event. Due to economic importance of these reviews, there is growing trend of writing user reviews to promote a product. Nowadays, users prefer online blogs and review sites to purchase produ...


Domain-Specific Sentiment Classification for Games-Related Tweets

Sentiment classification provides information about the author’s feeling toward a topic through the use of expressive words. However, words indicative of a particular sentiment class can be domain-specific. We train a text classifier for Twitter data related to games using labels inferred from emoticons. Our classifier is able to differentiate between positive and negative sentiment tweets labe...




Publication date: 2013